The Visual System's Measurement of Invariants Need Not Itself Be Invariant
Authors: Johan Wagemans, Luc Van Gool, and Christian Lamote
Abstract
When two shapes that differ in orientation or size have to be compared, or when objects have to be recognized from different viewpoints, the response time and error rate are systematically affected by the size of the geometric difference. In this report, we argue that these effects are not necessarily solid evidence for the use of mental transformations and against the use of invariants by the visual system. We report an experiment in which observers were asked to give affine-invariant coordinates of a point located in an affine frame defined by three other points. The angle subtended by the coordinate axes and the ratio of the lengths of their unit vectors systematically affected the measurement errors. This finding demonstrates that the visual system's measurement of invariants need not itself be invariant.
An important problem for visual perception is how to establish a constant visual world from the continuously changing available information. For objects to be recognized, for example, the visual system must somehow deal with the changing projections depending on the point of observation. There is considerable debate about how this is done (see Tarr, 1995, for a review). According to one approach to shape constancy, the perceptual system makes use of features of the projected image or attributes of the optic array that remain unchanged, or invariant, under changes in viewpoint (e.g., Gibson, 1950, 1979). Despite some psychophysical evidence supporting this position (e.g., Cutting, 1986; Pizlo, 1994) and the current popularity of invariants in computer vision (e.g., Mundy & Zisserman, 1992; Van Gool, Moons, Pauwels, & Wagemans, 1994), the dominating belief seems to be that object recognition cannot be based on invariants because objects are harder to recognize from some viewpoints than from others.
Typically, increasing recognition latencies and error rates are observed with an increasing orientation difference between a previously learned or standard orientation of an object and a subsequently viewed version of it (e.g., Cooper, 1976; Jolicoeur, 1985; Jolicoeur & Landau, 1984). These results have been interpreted as solid empirical evidence for an alternative class of theories according to which different views of objects are matched through a mental transformation or normalization process (e.g., Tarr & Pinker, 1989, 1990; Ullman, 1989). This view makes extensive reference to the way Shepard and Cooper (1982) interpreted the effects of orientation disparity in handedness discrimination tasks, namely, as evidence for mental rotation.
In this report, we argue that the frequently observed effects of parametric differences on the difficulty of the matching task (as measured by response times and error rates) need not be solid evidence against the visual system's use of invariants. For three-dimensional (3-D) objects, Biederman and Gerhardstein (1993) argued that the effects of viewpoint might be caused by the occlusion of different parts of an object or by the disappearance of nonaccidental properties, which are critical to determine the part category to which each part belongs (see also Farah, Rochlin, & Klein, 1994; Tarr & Bülthoff, 1995). For two-dimensional objects, the problem may be even more basic. Consider Figure 1a, which presents the projections of two planar shapes. No information is available on their 3-D orientation and position (together referred to as pose). If we assume pseudo-orthographic projection (no perspective), could these projections have resulted from the same shape?
Address correspondence to Johan Wagemans, Laboratory of Experimental Psychology, University of Leuven, Tiensestraat 102, B-3000 Leuven, Belgium; e-mail: johan.wagemans@psy.kuleuven.ac.be.
According to the mental transformation approach, the visual system is capable of simulating in 3-D space paths that correspond to combinations of 3-D rotations and translations of one projection, and then deciding whether there is a path that works out well and yields the other projection. In that case, the two projections are affine equivalent, which means that one can be mapped onto the other by a plane affine transformation. According to the invariants-based approach, in contrast, the visual system is capable of finding features that are invariant under the group of transformations that relate both images, which in this case are affine invariants. In a recent series of experiments, participants were asked to match dot-pattern versions of these patterns (with dots at the vertices, one of which was marked as a reference point) under affine transformations (Wagemans, Van Gool, Lamote, & Foster, 1996). Results demonstrated that the task could be done reasonably well (i.e., from 75% to 95% correct identifications, depending on the conditions), even with patterns that contained minimal information, but the evidence was mixed regarding the theoretical controversy between the mental transformation approach and the invariants-based approach. On the one hand, the elimination of one of the transformation components (i.e., tilt) did not result in any appreciable improvement of general performance level (i.e., around 90% correct identifications in both conditions). It is difficult to reconcile this result with the use of mental transformations. On the other hand, performance was modulated rather strongly by some of the affine transformation parameters in some of the experiments (e.g., response times increased from 2 s at 0° to 3.5 s at 180° rotation and error rates increased from 7% at 0° to 20% at 60° slant). One would not expect these effects from an invariants-based approach.
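The notion of affine equivalence used above can be made concrete with a small sketch. It is only an illustration of the geometry, not a model of the visual system and not the procedure used in the experiments: it assumes that point correspondences are known and that the first three points of each pattern are non-collinear. The function names (`fit_affine`, `apply_affine`, `affine_equivalent`), the tolerance, and the example coordinates are all hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def fit_affine(src: Sequence[Point], dst: Sequence[Point]) -> Tuple[Matrix, Point]:
    """Return (M, t) such that dst[i] = M * src[i] + t for three points."""
    (o, x, y), (o2, x2, y2) = src, dst
    # A has columns (x - o) and (y - o); B has columns (x2 - o2) and (y2 - o2).
    a, c = x[0] - o[0], x[1] - o[1]
    b, d = y[0] - o[0], y[1] - o[1]
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("the three source points are collinear")
    e, g = x2[0] - o2[0], x2[1] - o2[1]
    f, h = y2[0] - o2[0], y2[1] - o2[1]
    # M = B * inverse(A), written out for the 2 x 2 case.
    m11 = (e * d - f * c) / det
    m12 = (f * a - e * b) / det
    m21 = (g * d - h * c) / det
    m22 = (h * a - g * b) / det
    t = (o2[0] - (m11 * o[0] + m12 * o[1]), o2[1] - (m21 * o[0] + m22 * o[1]))
    return ((m11, m12), (m21, m22)), t


def apply_affine(m: Matrix, t: Point, p: Point) -> Point:
    """Apply the plane affine map p -> M * p + t."""
    return (m[0][0] * p[0] + m[0][1] * p[1] + t[0],
            m[1][0] * p[0] + m[1][1] * p[1] + t[1])


def affine_equivalent(a: Sequence[Point], b: Sequence[Point], tol: float = 1e-6) -> bool:
    """Fit the map on the first three points, then test the remaining ones."""
    m, t = fit_affine(a[:3], b[:3])
    for p, q in zip(a[3:], b[3:]):
        mapped = apply_affine(m, t, p)
        if math.hypot(mapped[0] - q[0], mapped[1] - q[1]) > tol:
            return False
    return True


# Hypothetical four-dot pattern and an affine image of it.
left = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0), (2.0, 1.5)]
right = [apply_affine(((1.0, 0.6), (0.3, 0.5)), (3.0, -1.0), p) for p in left]
# Same basis points, but the fourth dot is displaced: not affine equivalent.
other = right[:3] + [(right[3][0] + 0.5, right[3][1])]

print(affine_equivalent(left, right))
print(affine_equivalent(left, other))
```

This sketch decides equivalence by explicitly recovering a transformation and applying it, which is closer in spirit to the transformation-based account than to the invariants-based one.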
Although the perceptual effects of the transformation parameters seem to argue against the invariant nature of the visual processing of shape equivalence, they do not rule out that invariants are used. Let us try to clarify this point for a particular type of invariants, affine coordinates, which could be used to solve the problem of affine shape equivalence illustrated in Figure 1a. A triple of points suffices to define an affine coordinate frame (Koenderink & van Doorn, 1991; Ullman, 1989). One of the points plays the role of origin, while the other two define the coordinate axes and the unit lengths to be applied. Any additional point can then be given affine coordinates, following the construction of Figure 1b. Consider first the quadruple of dots drawn at the left. Suppose we take the dot in the lower left corner as the origin O, the one on the right as X, and the one in the upper left corner as Y. Draw a line from the hatched dot to OX, parallel to OY, and another line from the hatched dot to OY, parallel to OX. This yields two coordinates, x and y, that can be expressed as fractional numbers, Ox/OX and Oy/OY, respectively (0.50 and 0.75 for the example in Fig. 1b). These coordinates are affine invariant: The same fractions are obtained for all affine-equivalent patterns (e.g., in the pattern on the right of Fig. 1b, O'x'/O'X' is also 0.50 and O'y'/O'Y' is also 0.75). Because the coordinates are defined relative to the OX and OY lengths, OX and OY are called unit vectors.
Fig. 1. Two simple shapes related by an affine transformation (a) and a demonstration of how affine-invariant coordinates can be used to determine the affine shape equivalence of such patterns (b). See the text for more details.
Johan Wagemans, Luc Van Gool, and Christian Lamote. Psychological Science, Vol. 7, No. 4, July 1996. Copyright © 1996 American Psychological Society.
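The construction of affine coordinates described in the text can be sketched numerically. The snippet below is a minimal illustration under stated assumptions: the dot positions are hypothetical (chosen so that the coordinates come out as 0.50 and 0.75, as in the Figure 1b example), the affine map applied to them is arbitrary, and the function names `affine_coordinates` and `apply_affine` are introduced here for illustration only. Drawing lines parallel to OY and OX is equivalent to solving P - O = a(X - O) + b(Y - O) for a and b.

```python
from typing import Tuple

Point = Tuple[float, float]


def affine_coordinates(o: Point, x: Point, y: Point, p: Point) -> Tuple[float, float]:
    """Return (a, b) such that p = o + a * (x - o) + b * (y - o)."""
    ux, uy = x[0] - o[0], x[1] - o[1]  # unit vector OX
    vx, vy = y[0] - o[0], y[1] - o[1]  # unit vector OY
    px, py = p[0] - o[0], p[1] - o[1]
    det = ux * vy - uy * vx
    if abs(det) < 1e-12:
        raise ValueError("the basis points O, X, Y are collinear")
    # Cramer's rule for the 2 x 2 linear system.
    a = (px * vy - py * vx) / det
    b = (ux * py - uy * px) / det
    return a, b


def apply_affine(p: Point) -> Point:
    """An arbitrary (hypothetical) plane affine map used to make a second view."""
    return (1.0 * p[0] + 0.6 * p[1] + 3.0, 0.3 * p[0] + 0.5 * p[1] - 1.0)


# Hypothetical left-hand pattern: origin, X, Y, and the additional dot.
O, X, Y, P = (0.0, 0.0), (4.0, 0.0), (0.0, 2.0), (2.0, 1.5)
left_coords = affine_coordinates(O, X, Y, P)

# The same four dots after the affine map (the "oblique view").
O2, X2, Y2, P2 = (apply_affine(q) for q in (O, X, Y, P))
right_coords = affine_coordinates(O2, X2, Y2, P2)

print(left_coords)
print(right_coords)
```

Up to floating-point rounding, both calls return the same pair of numbers, which is the sense in which the coordinates are affine invariant. Nothing in the computation says how easy the two patterns are for a human observer to measure.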
The geometric construction underlying the definition of affine-invariant coordinates makes use of two well-known affine-invariant properties, namely, parallelism of lines and relative distances between three collinear points. The fact that such affine coordinates are affine invariants need not imply that their extraction from a pattern will always take the same amount of time or be equally accurate. First, there is the problem of selecting the same points as basis and using them in the same role (i.e., as origin or as defining the axes). Even with minimal patterns, the facilitation of finding the basis-point correspondences enhanced performance of subjects in detecting affine shape equivalence (Wagemans et al., 1996). In realistic, more complex shapes, the problem of choice is, of course, much larger. Second, it is fair to suspect that the extraction of the affine coordinates will be easier for the pattern at the left in Figure 1 than for the pattern at the right, which is a particularly oblique view. Because we did not know of any empirical evidence demonstrating such effects directly, in our experiment we instructed subjects explicitly to give affine-invariant coordinates. To disentangle the point search and coordinate measurement problems as much as possible, we used patterns consisting of four points only, and three points were indicated explicitly and unambiguously as basis points (assuming no reflections). We also manipulated the configurations systematically to investigate whether the angle subtended by the coordinate axes and the projected unit lengths affected the accuracy of the subjects' measurements. If this were the case, the results would constitute good empirical support for the influence of object pose on the estimation of affine coordinates and for the more general thesis that the visual system's measurement of affine invariants need not itself be invariant.
It would also follow that the frequently reported effects of parametric differences between shapes on the difficulty of assessing their shape equivalence need not per se reflect the use of mental transformations.
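The kind of manipulation described above, varying the angle between the coordinate axes and the relative unit lengths while keeping the true affine coordinates fixed, can be sketched as follows. This is a hypothetical stimulus generator written for illustration, not the authors' procedure. The function name `make_configuration`, the definition of the length ratio as OY/OX, and the specific angles and ratios in the example are all assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def make_configuration(
    axis_angle_deg: float,
    length_ratio: float,
    coords: Tuple[float, float],
    unit_x: float = 1.0,
) -> Tuple[Point, Point, Point, Point]:
    """Return (O, X, Y, P) for a four-dot pattern.

    axis_angle_deg: angle between OX and OY, in degrees (assumed parameter).
    length_ratio:   length of OY divided by length of OX (assumed definition).
    coords:         true affine coordinates (a, b) of the fourth dot P.
    """
    theta = math.radians(axis_angle_deg)
    o = (0.0, 0.0)
    x = (unit_x, 0.0)
    unit_y = length_ratio * unit_x
    y = (unit_y * math.cos(theta), unit_y * math.sin(theta))
    a, b = coords
    # P = O + a * OX + b * OY, with O at the origin.
    p = (a * x[0] + b * y[0], a * x[1] + b * y[1])
    return o, x, y, p


# Hypothetical design: same true coordinates, different frames.
for angle in (30.0, 60.0, 90.0):
    for ratio in (0.5, 1.0, 2.0):
        o, x, y, p = make_configuration(angle, ratio, (0.5, 0.75))
        print(angle, ratio, p)
```

Every configuration produced this way has the same correct answer (0.50, 0.75), so any systematic change in an observer's errors across angles or ratios would have to come from the measurement process and not from the invariant being measured.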